Abstract

This study provides a first, comprehensive, diagnostic use of DNA barcodes for the Canadian fauna of noctuoids or "owlet" 
moths (Lepidoptera: Noctuoidea) based on vouchered records for 1,541 species (99.1% species coverage), and more than 
30,000 sequences. When viewed from a Canada-wide perspective, DNA barcodes unambiguously discriminate 90% of the 
noctuoid species recognized through prior taxonomic study, and resolution reaches 95.6% when considered at a provincial 
scale. Barcode sharing is concentrated in certain lineages, with 54% of the cases involving 1.8% of the genera. Deep 
intraspecific divergence exists in 7.7% of the species, but further studies are required to clarify whether these cases reflect 
an overlooked species complex or phylogeographic variation in a single species. Non-native species possess higher Nearest- 
Neighbour (NN) distances than native taxa, whereas generalist feeders have lower NN distances than those with more 
specialized feeding habits. We found high concordance between taxonomic names and sequence clusters delineated by the 
Barcode Index Number (BIN) system with 1,082 species (70%) assigned to a unique BIN. The cases of discordance involve 
both BIN mergers and BIN splits with 38 species falling into both categories, most likely reflecting bidirectional 
introgression. One fifth of the species are involved in a BIN merger reflecting the presence of 158 species sharing their 
barcode sequence with at least one other taxon, and 189 species with low, but diagnostic COI divergence. A very few cases 
(13) involved species whose members fell into both categories. Most of the remaining 140 species show a split into two or 
three BINs per species, while Virbia ferruginosa was divided into 16. The overall results confirm that DNA barcodes are 
effective for the identification of Canadian noctuoids. This study also affirms that BINs are a strong proxy for species, 
providing a pathway for a rapid, accurate estimation of animal diversity. 

Introduction 

DNA barcoding has established itself as a powerful tool for 
species identification and discovery [1] with varied applications, 
especially in species-rich groups. Prior work on DNA barcoding of 
butterflies and moths (Lepidoptera) has investigated taxa with high 
morphological variability [2,3], has linked immature stages with 
adults [4], has examined species of biosecurity concern [5-7] and 
sexual dimorphisms [8]. DNA barcoding has also aided the 
discovery of new species [9, 10] and is accelerating their description 
[11-14]. Although there are situations in which DNA barcoding 
does not deliver species-level resolution [15-18], they seem 
infrequent, and most cases involve a small group of closely allied 
species. 

Because of the effectiveness of DNA barcoding and its diverse 
applications, efforts are underway to assemble comprehensive 
DNA barcode reference libraries at both national and continental 
scales. Although these libraries are complete for some groups of 
vertebrates in certain geographic realms (e.g., the birds of North 
America), no major invertebrate group has seen similar analysis. 
The present study begins to address this gap by providing barcode 
coverage for Canadian Noctuoidea (hereafter noctuoids), the most 
diverse superfamily of Lepidoptera. With nearly 50,000 described 
species [19], noctuoids are an important component of terrestrial 
ecosystems. They are also one of the most destructive groups of 
agricultural pests [20]. Although knowledge of global noctuoid 
diversity is relatively poor, the fauna of North America [21-25], 
especially Canada [26-30], is well known. Among the 3700 
noctuoid species from North America [25], 1555 occur in Canada 
including representatives from five of the six noctuoid families 
(Fig. 1, Table 1). The taxonomic maturity and high diversity of 
Canadian noctuoids provide an excellent system for assessing the 
performance of DNA barcodes in species discrimination. 

Prior barcode studies on Lepidoptera have demonstrated that 
DNA barcode libraries deliver high species resolution, but most 
investigations have examined small geographic areas or only a 
fraction of the species in a target assembly. For example, prior 
work on North American Lepidoptera examined just 20% of the 
species known from the eastern third of the continent [31]. 
Although this study reported 99% success in species identification, 
cases of incomplete resolution might well rise with increasing 
taxon coverage. Other taxonomically comprehensive studies have 
revealed 90-99% success [32-35], but they targeted relatively 
small areas so they do not rule out the possibility that resolution 
may drop with increasing geographic scope. The present study 
examines the impacts of increasing taxon coverage and geographic 
scale by examining barcode resolution for nearly all Canadian 
species of noctuoids. 

Figure 1. Phylogenetic hypothesis and species richness of Canadian Noctuoidea. Number of species known from Canada for five noctuoid 
families, as well as the family-level phylogeny [64]. 
doi:10.1371/journal.pone.0092797.g001 

Aside from enabling a test of barcode performance in a diverse 
species assemblage at a large geographic scale, the present results 
provide a good opportunity to examine the performance of the 
Barcode Index Number (BIN) System, an interim taxonomy that 
assigns specimens to sequence clusters termed BINs [36]. The BIN 
system aggregates individuals sharing similar COI sequences using 
single linkage clustering and a graph analytical approach, and the 
members of a BIN often correspond to recognized species in 
groups with strong taxonomy. It has been proposed that the BIN 
system can accelerate taxonomic progress in groups that have seen 
little investigation by providing a tool for aggregating specimens 
that are likely to be conspecific [36]. Although the BIN system has 
been recently implemented, its performance needs further 
evaluation. By testing the concordance between BIN membership 
and morphospecies boundaries in well-studied lineages, such as 
Canadian noctuoids, the utility and constraints of the BIN system 
for species delineation in lesser-known groups can be evaluated. 
Furthermore, the rich biological data available for this economically 
important taxon allow for the investigation of the link 
between feeding habits (i.e., specialized versus generalist) and 
barcode divergences (i.e., Nearest-Neighbour distances). 



Table 1. Summary of barcode coverage for Canadian noctuoid species including the source of specimens, Nearest-Neighbour 
distances, and the percentage of species in each family identifiable with barcodes. 

Family        CAN species /     Origin of specimens  # DNA      Mean Nearest-Neighbour  % ID     Species sharing
              barcode coverage  (Canada/USA/other)   sequences  distance (%)            success  barcodes
Notodontidae  57 / 57           53/4/0               1650       4.73                    100      0
Euteliidae    8 / 8             5/3/0                90         5.80                    100      0
Nolidae       17 / 17           16/1/0               220        4.08                    100      0
Noctuidae     1145 / 1133       1001/132/0           21726      3.01                    91.10    101
Erebidae      328 / 326         258/64/4             6839       3.49                    82.5     57
Total         1555 / 1541       1333/204/4           30525      3.19*                   90.0*    158

Asterisks indicate weighted means. 
doi:10.1371/journal.pone.0092797.t001 

Materials and Methods 

Sampling strategy and geographic coverage 

With a surface area of 9.984 million km² and a maximum 
breadth of 9306 km, Canada is the world's second largest country. 
It includes four biomes: tundra (arctic and alpine), forests 
(temperate and boreal), deserts (cold and semiarid), and grasslands 
(mixed and fescue prairie; tallgrass prairie; and bunchgrass/ 
sagebrush). About 50,000 insect species occur in Canada, and 
Lepidoptera comprise nearly 10% of this total [29], with one third 
(1555 out of 4700) of these species being noctuoids (Fig. 1). The 
present study involved the analysis of 30,525 specimens with 
86.8% derived from Canada (1333 species; about 28,000 
sequences) (Data set S1). The Canadian National Collection of 
Insects, Arachnids, and Nematodes made the largest contribution 
of museum specimens (5976), while the Biodiversity Institute of 
Ontario provided 19,993 freshly collected individuals. Specimens 
were analyzed from the full geographic and habitat range of each 
species within Canada whenever possible (Data sets S1 and S2). 
However, coverage for some taxa could only be gained by 
analyzing specimens from other nations (Table 1). Most of these 
'extra-territorials' derived from the USA (204 species, 2419 
specimens), but 69 Eurasian specimens were analyzed for three 
introduced species that are very rare (Parascotia fuliginaria) or 
extirpated (Euproctis chrysorrhoea, Euproctis similis). Finally, barcodes 
were obtained from 23 Neotropical specimens for two species 
(Eudocima apta, Hypocala andremona) that are extremely rare migrants 
to Canada and the USA (Data set S1). The inclusion of extra-territorial 
specimens was justified by examining sequence variation 
in other species with barcode records from both Canada and 
the United States; this analysis did not reveal significant sequence 
divergence linked to their nation of origin. All specimens were 
identified and validated by co-authors JDL and BCS; genitalia 
dissections were made when necessary. Taxonomy (see Data set 
S2) follows the most recent checklist of the Noctuoidea of North 
America north of Mexico [23-25]. 

Data acquisition and analysis 

DNA extraction, PCR amplification, and sequencing of the 
COI barcode region were performed at the Canadian Centre for 
DNA Barcoding (CCDB) and followed standard protocols [37-41]. 
PCR and sequencing generally used a single pair of primers, 
LepF1 (ATTCAACCAATCATAAAGATATTGG) and LepR1 
(TAAACTTCTGGATGTCCAAAAAATCA) [2], which recover 
a 658 bp region near the 5' end of COI including the 648 bp 
barcode region for the animal kingdom [1]. For museum 
specimens older than ten years, primer pairs designed to amplify 
smaller overlapping fragments (307 bp, 407 bp) were employed 
[41]. 

Data set S1 provides details (e.g., voucher codes, higher 
taxonomy, repository institutions, COI sequence length, collection 
dates and collection data) on all barcoded specimens; residual 
DNA extracts are stored in the DNA Archive at the CCDB. All 
new sequences are deposited in GenBank with accession numbers 
available in Data set S3. Specimen data including images, details 
on the voucher repositories, GPS coordinates for collection sites, 
sequence records, trace files, and GenBank accession numbers are 
available in the Barcode of Life Data Systems (BOLD, 
www.boldsystems.org) in two public datasets: DS-CANNOC1 
(dx.doi.org/10.5883/DS-CANNOC1) and DS-CANNOC2 
(dx.doi.org/10.5883/DS-CANNOC2). The number of barcode 
sequences per species varies from 1 to 508 (mean = 19.8) (Data set 
S2). Only sequence records greater than 500 bp (range 500 bp- 
658 bp), those that meet length and quality requirements of the 
BARCODE data standard [42], are included. Of the 1555 species 
known from Canada (Data set S1), only 14 extremely rare species 
now lack barcode coverage. They include two Erebidae (Grammia 
philipiana, Hypena modestoides) and 12 Noctuidae (Acronicta falcula, 
Agrotis kingi, Annaphila danistica, Eupsilia fringata, Lasionycta illima, 
Lasionycta macleani, Melaporphyria immortua, Papaipema aerata, 
Papaipema pertincta, Pyreferra ceromatica, Xestia fergusoni, and Xestia 
staudingeri). 

Tests of barcode performance were made at a national level 
using the species list for Canada and for three regions (British 
Columbia, Ontario, New Brunswick/Nova Scotia) based on 
current barcode coverage for each area. Coverage was available 
for 668 of the 800 species known from British Columbia, for 617 of 
the 867 species from Ontario and for 387 of the 585 species known 
from New Brunswick and Nova Scotia. Patterns of intra- and 
interspecific sequence variation were explored at various taxonomic 
levels using the Kimura-2-Parameter (K2P) distance model 
and the neighbor-joining (NJ) algorithm calculated using analytical 
tools on BOLD. For a few taxa with either low or deep sequence 
divergence, model-based phylogenetic analysis (i.e., maximum 
likelihood, ML) was employed to examine patterns of intraspecific 
variation and relationships with sister species in more detail. For 
the study of association between host plant use and barcode 
divergences, we divided host plant types into four major categories: 
1) monocots (primarily grasses) & herbaceous dicots, 2) trees & 
shrubs, 3) detritus, fungi & lichens, and 4) generalists. Generalist 
feeders are those species that consume a broad range of monocots 
and dicots, often both herbaceous and woody plants. The 
significance of differences in interspecific (i.e., NN distances) and 
intraspecific variation among the four categories was assessed 
using nonparametric tests (e.g., Mood's Median Test). To deal 
with the problem of unequal variances and sample sizes in the NN 
distance and intraspecific data, an unequal-variance t-test and 
random subsampling of cases were also employed. Finally, to assess 
the correlation between genus size and barcode-sharing incidence, 
we performed a nonparametric correlation test (Spearman) in 
SPSS v18 (IBM). 
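
The K2P model named above corrects the raw proportion of differing sites by treating transitions (P) and transversions (Q) separately: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The following is a minimal sketch of that formula for two pre-aligned sequences. It is not the BOLD implementation used in this study; the function name `k2p_distance` and the decision to skip gaps and ambiguous bases are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Illustrative helper (hypothetical name). Sites where either sequence
    has a gap or an ambiguous base are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError for saturated pairs where the argument is <= 0
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # One transition in ten sites; multiply by 100 to express as a percentage
    print(round(100 * k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 2))
```

Distances in the text are reported as percentages, so the returned value would be multiplied by 100 before comparison with, for example, the 2% threshold used for deep divergence.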

Results 

Barcode Performance 

DNA barcodes were obtained for 1541 of the 1555 noctuoid 
species known from Canada. No indels causing frameshifts or stop 
codons were detected among the 30,525 sequences recovered from 
these taxa, suggesting that they derive from COI rather than a 
pseudogene. Most species (90%) have diagnostic barcode sequences 
when considered from a Canada-wide perspective (Trees S1, 
S2, S3, S4, and S5). Identification success was even higher when 
analysis was restricted to a particular region, with 95.3% success for 
New Brunswick and Nova Scotia (369/387), 96% for Ontario 
(592/617), and 95.4% for British Columbia (637/668) (Table S1). 
Mean Nearest-Neighbour (NN) distances showed modest variation 
among the families with more than 50 species, ranging from a low 
of 3.01% in Noctuidae to a high of 4.73% in the Notodontidae 
(Table 1); the Euteliidae had a slightly higher NN distance (5.8%), 
but the family was represented by only a few species. There was 
significant variation in barcode performance among families 
(χ² = 38.3, p<0.0001). Species in three families (Euteliidae, 8 
species; Nolidae, 17 species; Notodontidae, 57 species) were 
perfectly discriminated by barcode sequences, but 8.9% (101 out of 
1133 species) of the Noctuidae and 17.5% (57 out of 326) of the 
Erebidae could not be discriminated because of barcode sharing 
by two or more species (Table S2; Trees S4 and S5). The incidence 
of barcode sharing seemed to be associated with the number of 
species in a genus (Fig. 2); however, a statistical test did not support 
this association (Spearman correlation coefficient = 0.22; p = 0.24): 
15.6% of the species in the 17 most diverse genera (16-123 species) and 
8.1% of the species in genera with two to fifteen species shared 
their barcode with at least one other taxon. 

Figure 2. Impact of genus size on DNA barcode performance. The relationship between the number of species in a genus (plotted on a log2 
scale) and the incidence of barcode sharing. Values above the bars indicate the number of genera and the number of species in each log2 category. 
doi:10.1371/journal.pone.0092797.g002 
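
The identification success rates quoted under 'Barcode Performance' follow directly from the counts given in the text and in Table 1. The short sketch below recomputes them as a consistency check. The helper name `success_rate` is hypothetical, and pooling the three regions (which counts a species once per region in which it occurs) is an assumption about how a single provincial-scale figure could be formed, not a documented step of the study.

```python
def success_rate(resolved: int, barcoded: int) -> float:
    """Percentage of barcoded species that have a diagnostic barcode."""
    return 100.0 * resolved / barcoded


# (species with diagnostic barcodes, species with barcode coverage), from the text
regions = {
    "New Brunswick / Nova Scotia": (369, 387),
    "Ontario": (592, 617),
    "British Columbia": (637, 668),
}

# (species sharing barcodes, species with barcode coverage), from Table 1
families = {
    "Noctuidae": (101, 1133),
    "Erebidae": (57, 326),
    "All noctuoids": (158, 1541),
}

if __name__ == "__main__":
    for name, (resolved, barcoded) in regions.items():
        print(f"{name}: {success_rate(resolved, barcoded):.2f}%")

    pooled_resolved = sum(r for r, _ in regions.values())
    pooled_barcoded = sum(b for _, b in regions.values())
    print(f"Pooled regions: {success_rate(pooled_resolved, pooled_barcoded):.2f}%")

    for name, (sharing, barcoded) in families.items():
        print(f"{name}: {success_rate(barcoded - sharing, barcoded):.2f}%")
```

Under these assumptions the pooled regional value rounds to 95.6%, and the national value to about 90%, in line with the figures given in the Abstract.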

Cases of Barcode Sharing 

The 57 cases of barcode sharing among the 326 species of 
Erebidae involved taxa in 11 of its 109 genera (Table S2). Twenty-two 
of these cases involved assemblages of two to four species in 
nine genera (Arctia -3, Dasychira -2, Dodia -2, Haploa -4, Idia -1, 
Pararctia -2, Spilosoma -3, Virbia -2, Zanclognatha -2), while the other 
35 cases involved members of just two genera, Grammia and 
Catocala. The 11 cases in Grammia involved three haplotype clusters 
shared by two to seven species, while the 24 cases in Catocala 
included five sequence clusters with two to eight species. The most 
dramatic cases of sequence sharing in Catocala involved assemblages 
of species which feed on the same food plant. For example, 
eight hickory-feeding species (Carya, Juglandaceae) (C. flebilis, C. 
habilis, C. judith, C. obscura, C. residua, C. retecta, C. robinsonii, C. vidua) 
possess closely similar or identical barcodes, while another 
barcode-sharing assemblage of six species (C. californica, C. briseis, 
C. faustina, C. grotiana, C. hermia, C. semirelicta) feeds on willows and 
poplars (Salicaceae) [43-45]. 

The 101 cases of barcode sharing among the 1133 species of 
Noctuidae (Table S2) involved taxa in 29 of its 248 genera 
(Abagrotis -8, Acronicta -2, Agrotis -2, Agriopodes -2, Alypia -2, 
Amphipoea -2, Apamea -2, Bellura -2, Copablepharon -2, Dargida -2, 
Epidemas -2, Eremobina -2, Eupsilia -2, Euxoa -17, Hyppa -2, 
Ipimorpha -3, Lasionycta -7, Lithophane -5, Mythimna -2, Panthea -3, 
Papaipema -2, Polia -2, Resapamea -2, Rhyacia -2, Sunira -2, Sympistis 
-2, Syngrapha -4, Trichordestra -2, Xestia -12). Some large genera, 
such as Acronicta and Sympistis, which include 48 and 52 Canadian 
species respectively, showed a very low incidence of barcode 
sharing (just two species each). By contrast, nearly half of the cases 
of barcode sharing in this family involved just four genera (8/26 
species of Abagrotis, 17/123 species of Euxoa, 7/34 species of 
Lasionycta, 12/45 species of Xestia). Most of these cases of barcode 
sharing involved morphologically very similar species, but there 
were exceptions. For example, Lasionycta taigata and L. skraelingia 
are morphologically distinct sister species, but they share barcodes. 

Cases of Low Barcode Divergence 

Twenty-seven genera (Anarta, Caradrina, Cissusa, Cosmia, Cucullia, 
Dasychira, Datana, Diarsia, Egira, Enargia, Euclidia, Eupsilia, Feltia, 
Feralia, Hadena, Hypoprepia, Leucania, Neoarctia, Papestra, Phragmatobia, 
Schinia, Setagrotis, Spaelotis, Symmerista, Sympistis, Xylena, Zale) included 
two or more species with low divergence, but with no evidence of 
shared sequences (Table S3). Species of Lasionycta provide a key 
example of low divergence coupled with a few cases of sequence 
sharing (Fig. 3). 

Cases of Deep Intraspecific Sequence Divergence 

Deep (>2%) barcode divergence was detected in 119 (7.7%) 
species and another 21 species showed sufficient divergence 
(1.2%-1.9%) for their members to be assigned to two BINs. These 
140 taxa included representatives from 83 of the 387 genera of 
noctuoids and most were partitioned into two (100) or three (30) 
BINs, but 10 were placed in four or more (Table S4). Virbia 
ferruginosa showed exceptional diversity with its members assigned 
to 16 BINs. The cause of this remarkable molecular variation is 
currently not clear, but taxonomic study (BCS) suggests that this 
variation is not linked to cryptic species. Although many of the 140 
cases require more investigation, Table 2 lists 12 species where 
biological covariates are associated with barcode clusters, indicating 
that unrecognized species are known or probable. For 
example, specimens in the 11 barcode lineages of Idia lubricalis 
show differences in external and genitalic morphology, and 
include a number of unrecognized species (BCS, in prep.). 
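
As a small illustration of the 2% criterion used in this section, the sketch below flags species whose divergence exceeds the threshold and checks that the tallies reported above are mutually consistent. The function name `flag_deep_divergence` is hypothetical, and treating the '% sequence divergence' values from Table 2 as maximum intraspecific distances is an assumption made only for this example.

```python
DEEP_DIVERGENCE_THRESHOLD = 2.0  # percent; the criterion used in the text


def flag_deep_divergence(max_intraspecific, threshold=DEEP_DIVERGENCE_THRESHOLD):
    """Return, sorted by name, the species whose divergence exceeds the threshold."""
    return sorted(sp for sp, d in max_intraspecific.items() if d > threshold)


# Four values taken from Table 2 (interpreted here as maximum divergence, %)
example = {
    "Pheosia rimosa": 5.1,
    "Idia lubricalis": 3.8,
    "Idia americalis": 1.8,
    "Papaipema pterisii": 1.95,
}

# Tallies reported in the text
species_with_barcodes = 1541
deep_species = 119            # > 2% divergence
low_but_split_species = 21    # 1.2%-1.9%, still assigned to two BINs
split_into = {"two BINs": 100, "three BINs": 30, "four or more BINs": 10}

if __name__ == "__main__":
    print(flag_deep_divergence(example))
    print(f"{100 * deep_species / species_with_barcodes:.1f}% of species exceed 2%")
    print(deep_species + low_but_split_species == sum(split_into.values()))
```

Species such as Idia americalis fall below the threshold yet still appear in Table 2, which shows that a fixed cut-off is only a first screen and not a substitute for the morphological and biological evidence discussed here.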

Figure 3. Low sequence divergence in Lasionycta. Maximum likelihood tree (COI barcode) for Lasionycta demonstrating very low sequence 
divergences and cases of overlapping or shared haplotypes. Terminals with vertical bars indicate one or a few samples sharing an identical haplotype; 
those with triangles represent collapsed haplotypes with less than 2% sequence divergence. Geographic origin is given in brackets as standard 
abbreviations for provinces (Canada) or states (USA); FIN = Finland. 
doi:10.1371/journal.pone.0092797.g003 

Factors Influencing Nearest-Neighbour Distances. Two 
factors were found to impact Nearest-Neighbour (NN) distance. 
Firstly, the 26 species of non-native Canadian noctuoids [23-25] 
possess a significantly (χ² = 17.53; Median = 2.95; p<0.0001) 
higher NN distance (mean = 5.9%) (Table 3) than native species 
(mean = 3.02%). Secondly, there is evidence of an association between 
food plant usage and interspecific (i.e., NN distance) divergences. 
Records on host plants are available for about 80% of Canadian 
noctuoids [43-45], permitting their assignment to one of four host 
plant categories: 1) monocots & herbaceous dicots, 2) trees & 
shrubs, 3) detritus, fungi & lichens, and 4) generalists. Generalist 
feeders possessed a lower NN distance (2.09%) than species in the 
other feeding categories (Fig. 4), and both a nonparametric test and 
an analysis of variance of random samples of equal size indicated 
that this difference was significant (χ² = 94.89; Median = 3.09; 
p<0.0001) (Tables 4-7). The level of intraspecific variation was 
significantly (χ² = 10.03; Median = 0.16; p<0.018) lower among 
grass/herbaceous feeders (first category = 0.27%) than in other 
categories (category 2 = 0.33%; 3 = 0.51%; 4 = 0.48%). As discussed 
under 'Barcode Performance', genus size might also be expected to affect 
NN distance, with large genera presumably having higher rates of 
barcode sharing (Fig. 2), which would intuitively lower 
NN values. Nevertheless, statistical tests did not support this 
association. 
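
The Nearest-Neighbour distance of a species is the smallest distance between any of its sequences and any sequence of a different species. The sketch below shows that definition applied to a toy distance matrix. It is not the BOLD tool used in this study; the function name `nearest_neighbour_distances`, the species labels and the distance values are all invented for illustration.

```python
def nearest_neighbour_distances(labels, dist):
    """Minimum distance from each species to any sequence of another species.

    labels[i] is the species of sequence i and dist[i][j] is the pairwise
    distance (%) between sequences i and j. Hypothetical helper.
    """
    nn = {}
    for i, species_i in enumerate(labels):
        for j, species_j in enumerate(labels):
            if species_i == species_j:
                continue  # conspecific pairs do not count towards NN distance
            d = dist[i][j]
            if species_i not in nn or d < nn[species_i]:
                nn[species_i] = d
    return nn


# Toy data: two sequences of species "A", one each of "B" and "C"
labels = ["A", "A", "B", "C"]
dist = [
    [0.0, 0.2, 3.0, 5.0],
    [0.2, 0.0, 2.5, 5.5],
    [3.0, 2.5, 0.0, 4.0],
    [5.0, 5.5, 4.0, 0.0],
]

if __name__ == "__main__":
    for species, d in sorted(nearest_neighbour_distances(labels, dist).items()):
        print(species, d)
```

A species whose NN distance is zero shares its barcode with another taxon, which is why barcode sharing and low NN values are discussed together in this section.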

Table 2. Twelve Canadian noctuoids with deep (>2%) intraspecific barcode variation that also show morphological divergence 
between their barcode clusters. 

Family        Subfamily      Species (authority)                          # of      % Sequence  Condition
                                                                          clusters  divergence
Notodontidae  Notodontinae   Furcula cinerea (Walker, 1865)               5         2.7         taxonomic status under revision
Notodontidae  Notodontinae   Furcula occidentalis (Lintner, 1878)         5         2.7         taxonomic status under revision
Notodontidae  Notodontinae   Pheosia rimosa Packard, 1864                 8         5.1         one haplotype seems to be a good species (P. portlandia) - under revision
Nolidae       Chloephorinae  Nycteola n. sp.                              2         3           a possible new species from BC and CO, 3% diverged from sister species N. fletcheri
Erebidae      Herminiinae    Idia lubricalis (Geyer, 1832)                11        3.8         species complex includes various forms (size, colour, maculation, etc.) - needs to be studied
Erebidae      Herminiinae    Idia americalis (Guenée, 1854)               3         1.8         biological evidence for cryptic species (i.e., pheromones), despite low intraspecific barcode divergence
Erebidae      Hypenodinae    Hypenodes n. sp.                             5         2.1         five undescribed species
Erebidae      Erebinae       Melipotis perpendicularis (Guenée, 1852)     4         2.3         species complex with various haplotypes of 1.45% intraspecific variation
Erebidae      Erebinae       Caenurgina crassiuscula (Haworth, 1809)      5         2.95        species complex with various diverged haplotypes of 2.95% intraspecific variation
Noctuidae     Noctuinae      Anarta crotchii (Grote, 1880)                2         43          two distinct barcode clusters of 3.6% sequence divergence - barcode clusters do not match the morphotypes
Noctuidae     Noctuinae      Papaipema pterisii Bird, 1907                6         1.95        one diverged haplotype seems related to P. pterisii but with different feeding habits (ostrich fern)
Noctuidae     Noctuinae      Lacinipolia strigicollis (Wallengren, 1860)  4         5.4         one diverged haplotype of 5.4% difference - no obvious difference in external or internal morphology, distribution

doi:10.1371/journal.pone.0092797.t002 

Figure 4. Impact of host plant type on NN distances. Nearest-Neighbour (NN) distances for species of Canadian noctuoids using four food 
plant categories: 1) monocots or herbaceous dicots, 2) trees or shrubs, 3) detritus, fungi and lichens, and 4) generalist. Values above the bars indicate 
the number of species in each food plant category (n), average of NN/standard errors (SE). 
doi:10.1371/journal.pone.0092797.g004 

PLOS ONE | www.plosone.org | March 2014 | Volume 9 | Issue 3 | e92797 
DNA Barcoding Noctuoidea of Canada 

Congruence Between Species Boundaries of Recognized 
Species and BINs. We found close correspondence between 
the number of species (1541) analyzed and the number of BIN 
(1515) assignments (Table 8). However, the strength of this 
congruence was partially a consequence of the counterbalancing 
effects of BIN splits and mergers. In actuality, perfect correspondence 
between the assignment of specimens to a particular species 
and their placement in a unique BIN was only evident for 1082 of 
the 1541 species (70%). Another 140 species (including all 119 
species with >2% intraspecific sequence divergence) were 
involved in splits with their members assigned to two (100 species), 
three (30 species), or more BINs (10 species). Finally, 348 species 
were involved in a merger where they were placed in a BIN that 
included at least one other species. Some mergers involved species 
(158) that shared barcodes with at least one other taxon (Table S2), 
but most (189) involved species with diagnostic but low barcode 
divergence (Table S3). A very few cases (13) involved species 
whose members fell into both categories. 
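
The concordance categories described above (a unique BIN, a split, a merger, or both) can be derived mechanically from a list of specimen records. The sketch below shows one way to do so. It is an illustration and not the BOLD pipeline; the function name `classify_bin_concordance`, the category labels and the species and BIN identifiers in the example are all hypothetical.

```python
from collections import defaultdict


def classify_bin_concordance(records):
    """Classify each species by how its specimens map onto BINs.

    records is an iterable of (species, bin_id) pairs, one per specimen.
    Returns a dict mapping species to one of:
      "match"   - all specimens in one BIN that holds no other species
      "split"   - specimens spread over two or more BINs, none shared
      "merge"   - a single BIN that also contains another species
      "mixture" - both split and merged
    """
    bins_of_species = defaultdict(set)
    species_in_bin = defaultdict(set)
    for species, bin_id in records:
        bins_of_species[species].add(bin_id)
        species_in_bin[bin_id].add(species)

    result = {}
    for species, bins in bins_of_species.items():
        is_split = len(bins) > 1
        is_merged = any(len(species_in_bin[b]) > 1 for b in bins)
        if is_split and is_merged:
            result[species] = "mixture"
        elif is_split:
            result[species] = "split"
        elif is_merged:
            result[species] = "merge"
        else:
            result[species] = "match"
    return result


# Invented specimen records: (species, BIN identifier)
records = [
    ("sp1", "BIN-A"), ("sp1", "BIN-A"),
    ("sp2", "BIN-B"), ("sp2", "BIN-C"),
    ("sp3", "BIN-D"), ("sp4", "BIN-D"),
    ("sp5", "BIN-E"), ("sp5", "BIN-F"), ("sp6", "BIN-F"),
]

if __name__ == "__main__":
    for species, category in sorted(classify_bin_concordance(records).items()):
        print(species, category)
```

Because a species can be split and merged at the same time, simple sums of the split and merged counts can exceed the number of species analyzed, which is worth keeping in mind when comparing the totals reported in this section.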

Discussion 

As revealed by this study and other investigations, the results of 
large-scale DNA barcode analyses never perfectly replicate 
existing taxonomic systems; they reveal both instances of deep 
intraspecilic sequence divergence and other cases where members 
of different species share the same barcode sequence. In the 
present study, DNA barcodes differentiated more than 95% of 
currently recognized noctuoid species when considered at a 
provincial level (Table SI), and 90% when examined for the whole 
of Canada. The modest decline in identification success with 
increased geographic scale reinforces an earlier conclusion, based 
on a much smaller dataset, that increased geographic sampling 
does not seriously diminish the performance of DNA barcodes 
[46] . Moreover, the resolution obtained for Canadian noctuoids is 
similar to that observed for other groups of Lepidoptera in other 
geographic regions. For example, deWaard et al. [35] found 93% 
resolution in a study on 400 species of Geometridae from British 
Columbia, while Hebert et al. [31] observed 99% resolution for 
1200 species in diverse families of Lepidoptera from eastern North 
America. Results from Europe show similar performance with 

Table 3. A list of introduced noctuoid species into Canada. 



90% for 185 species of Romanian butterflies [32], 98.5%, for 400 
species of Bavarian geometrids [33] and 99% for 957 species from 
a broad range of macro-Lepidoptera in the same region [34] . 

This study revealed that 7.7% of Canadian noctuoids possess 
more than 2% intraspecilic divergence with this variation falling 
into two or more discrete sequence clusters. So long as these 
clusters are 'private' to a particular species, their presence does not 
complicate the assignment of specimens to a known taxon 
although they may signal overlooked species. The incidence of 
such cases of deep divergence in Canadian noctuoids is similar to 
the 5-8% reported in earlier work on other Lepidoptera faunas 
with well-studied taxonomy [3,31,33,34]. Such cases of deep 
divergence can arise in three ways and it is important to determine 
the causal factor for each case to understand its significance. Deep 
divergences can arise through the presence of cryptic species, the 
recovery of a pseudogene, or high intraspecific variation. The 
simplest initial step to discriminate among these alternatives lies in 
examining barcode groups for diagnostic differences in external or 
genitalic morphology. Any covariation between barcode clusters 
and other traits provides strong evidence that the current 
taxonomic system has overlooked species in the group under 
investigation [2,47,48]. For example, such covariation was noted 
in 12 of the 140 species with deep 'intraspecilic' divergence in this 
study (Table 2). In cases where such variation is not apparent, it is 





Introduced species to Canada 


Approximate dates of introduction 


Barcode coverage 


NN Distance 


Agrochola lota 


1976 


X 


6.40 


Amphipyra tragopoginis 




X 


6.28 


Apamea unanimis 


1991 


X 


4.07 


Calophosia lunula 


1955 — bio-control agent 


X 


6.51 


Caradrina morpheas 


1944-1955 


X 


4.14 


Cerapteryx graminis 


1966 


X 


4.22 


Chrysodeixis chalcites 


2008 


X 


5.39 


Cucullia umbratica 


1998 


X 


4.68 


Euproctis chrysorrhoea 


1897 






Euproctis similis 


1933 






Garella nilotica 




X 


7.85 


Hydraecia micocea 


1902 


X 


1.72 


Lateroligia ophiogramma 


1989 


X 


4.67 


Leucoma salicis 


1920 


X 


12.07 


Lymantria dispar 


1868 


X 


10.08 


Noctua comes 


1982 


X 


5.23 


Noctua pronuba 


1979 


X 


4.54 


Nolo cucullatella 


2008 






Oligia strigilis 


1990 


X 


4.82 


Parascotia fuliginaria 


<1980 






Rhizedra lutosa 


1991 


X 


6.21 


Spodoptera exigua 




X 


5.72 


Tathorhynchus exsiccata 




X 


6.86 


Trichoplusia ni 




X 


6.22 


Tyria jacobaeae 


1965 — bio-control agent 


X 


7.73 


Xestia xanthographa 


1907-1950 


X 


4.13 


Total 




22 


5.89 



Their NN distance and approximate date of introduction are shown. 
doi:l 0.1 371/journa!.pone.0092797.t003 
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Table 4. Summary of analysis of variance (ANOVA) of tine reiationsliip between NN distances at CO! and larval food plant 
categories for 1196 species of Canadian noctuoids. 





Groups 


Count (1/2) 


Sum (1/2) 


Average (1/2) 


Variance (1/2) 


Grass/herbaceous 


456/73 


1608.29/243.43 


3.53/3.33 


4.75/4.20 


Tree/shrubs 


456/73 


1518.13/259.92 


3.33/3.56 


5.73/6.43 


Detritivore/fungivore/lichenivore 


73/73 


291.51/291.51 


3.99/3.99 


5.29/5.29 


Generalist 


211/73 


440.53/175.18 


2.09/2.40 


2.22/2.28 



Host plant data set was analyzed in two different ways: 1) actual data set with unequal sample size (non-normal distributed data) and 2) re-sampled data set with equal 

sample size (73 samples). 

doi:l 0.1 371 /journal.pone.0092797.t004 



important to rule out the possibility that the clusters reflect the
recovery of the authentic COI gene from some individuals, and a
pseudogene from others. If the analysis of a second mitochondrial
gene (e.g., cytochrome b) also reveals deep intraspecific divergence
and its sequence clusters correspond with those at COI, the deep
barcode divergence is likely to be real rather than an artifact of
variable pseudogene recovery. Subsequent analysis can then focus
on determining if the sequence divergence at COI reflects the
presence of sibling species or an unusually high level of
intraspecific diversity. Such cases are best resolved through
multi-locus analysis (e.g., of nuclear loci) [49,50] of specimens from
geographic settings where the component lineages are sympatric.
If an exhaustive examination of nuclear markers shows no
differentiation between lineages, the variation at COI likely
reflects deep intraspecific divergence, such as that reported in
European populations of the geometrid Epirrita autumnata [51]. The
factor(s) responsible for divergence can then be analyzed; it may
reflect selective sweeps driven by Wolbachia [52] or secondary
contact between lineages formerly isolated in different glacial
refugia. Our study indicates the need for detailed analyses of this
sort to better understand the cause and taxonomic implications of
the deep sequence divergences in 140 species of Canadian
noctuoids (including the 12 taxa where barcode divergence was
linked to morphological differentiation). Virbia ferruginosa should be
a priority target, given its assignment to 16 BINs and the long-standing
taxonomic uncertainty surrounding this genus [53].
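
The follow-up workflow described above can be summarized as a small decision sketch. This is only an illustration of the reasoning in the text, not a procedure used in the study: the function name, the two boolean inputs and the outcome labels are hypothetical, and the "candidate sibling species" outcome is an assumed reading of the case where nuclear markers do separate the lineages.

```python
from enum import Enum


class Interpretation(Enum):
    """Hypothetical labels for the outcomes discussed in the text."""
    PSEUDOGENE_NOT_EXCLUDED = "pseudogene recovery not excluded"
    CANDIDATE_SIBLING_SPECIES = "candidate sibling species"
    DEEP_INTRASPECIFIC_DIVERGENCE = "deep intraspecific divergence"


def interpret_deep_divergence(second_mt_gene_concordant: bool,
                              nuclear_markers_differentiated: bool) -> Interpretation:
    """Classify a species with deep COI divergence.

    second_mt_gene_concordant: a second mitochondrial gene (e.g., cytochrome b)
        shows deep divergence with clusters matching those at COI.
    nuclear_markers_differentiated: nuclear loci separate the COI lineages
        in specimens from areas of sympatry.
    """
    # Step 1: without concordance at a second mitochondrial gene, the COI
    # clusters may still be an artifact of variable pseudogene recovery.
    if not second_mt_gene_concordant:
        return Interpretation.PSEUDOGENE_NOT_EXCLUDED
    # Step 2 (assumed reading): nuclear differentiation between sympatric
    # lineages points toward sibling species.
    if nuclear_markers_differentiated:
        return Interpretation.CANDIDATE_SIBLING_SPECIES
    # Step 3: no nuclear differentiation suggests deep intraspecific divergence.
    return Interpretation.DEEP_INTRASPECIFIC_DIVERGENCE


if __name__ == "__main__":
    print(interpret_deep_divergence(True, False).value)
```

In practice each input would itself rest on a full analysis; the sketch only fixes the order in which the questions are asked.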

Because the standard criterion for the evaluation of barcode
success involves its capacity to discriminate known species, cases of
barcode sharing attract particular attention. This study revealed
that 10% of Canadian noctuoids (158/1541) share their barcode
sequence with at least one other species and that the incidence of
such cases varies significantly among the five noctuoid families.
These cases of barcode sharing can have three causes: the species
involved may be young; they may be older, but have experienced
recent introgression; or they may actually represent a single species
(i.e., wrong taxonomy). Lineages undergoing active speciation
should include more species that are so young that they lack
diagnostic COI sequences. Viewed from this perspective, the
Notodontidae, which lacked any case of barcode sharing, has seen
less recent speciation than the Erebidae, where 17.5% of species
share barcodes. Aside from this divergence between families, there
was also a link to generic diversity. As might be expected, no case
of barcode sharing involved species in monotypic genera, while its
incidence reached 15.6% in the 17 most diverse genera (>16
species). Genera with an intermediate species count (2-15) also
showed an intermediate level of barcode sharing (8.1%), although
there was evidence of an unexpected trend toward lower barcode
sharing in these genera as the species count rose. Viewed from an
overall perspective, the 'taxonomic localization' of compromised
resolution was striking; seven of the 387 genera of noctuoids
accounted for 54% of all cases of barcode sharing. Although each
of these genera included a substantial number of species (range
20-123), they only account for 21.7% of all Canadian noctuoids,
meaning that they include a high proportion of taxa that share
barcodes, suggestive of active or recent speciation. Cases of
sequence sharing can also be due to oversplitting of species,
especially in species-rich genera. A recent study that utilized both
DNA barcoding and morphological approaches resolved several
taxonomic issues in North American Erebidae and Noctuidae
through the synonymization of previously oversplit species [54]. Most
current taxonomy is based on traditional morphological studies, so
there is no correct taxonomy to act as a reference system. Indeed,
correct designation of species boundaries in high diversity genera
usually requires comprehensive examination of reproductive
compatibility, host plant associations, morphological characters
and sequence divergences. Consequently, some cases of discordance
between traditional taxonomy and results of DNA
barcoding could reflect incorrect taxonomy arising as a result of



Table 5. Statistical results of analysis of variance (ANOVA) of the relationship between NN distances at COI and larval food plant
categories for 1196 species of Canadian noctuoids.

Source of Variation   SS (1/2)          df (1/2)   MS (1/2)       F (1/2)      P-value (1/2)
Between Groups        362.48/99.15      3/3        120.83/33.05   25.66/7.26   0.00/0.00
Within Groups         5613.26/1310.91   1192/288   4.71/4.55
Total                 5975.74/1410.06   1195/291

Host plant data set was analyzed in two different ways: 1) actual data set with unequal sample size (non-normally distributed data) and 2) re-sampled data set with equal sample size (73 samples).
doi:10.1371/journal.pone.0092797.t005

Table 6. Summary of nonparametric test (Mood's Median) of the relationship between NN distances at COI and larval food
plant categories for 1196 species of Canadian noctuoids.

Groups                             > Median   ≤ Median
Grass/herbaceous                   271        185
Tree/shrubs                        230        226
Detritivore/fungivore/lichenivore  51         22
Generalist                         46         165

doi:10.1371/journal.pone.0092797.t006



intraspecific polymorphism or overly exhaustive morphological 
studies of charismatic taxa. 
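
The proportions quoted in this paragraph can be rechecked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative, and the final "over-representation" ratio is a derived quantity added here for illustration rather than a statistic reported in the study.

```python
# Figures taken from the text above.
TOTAL_SPECIES = 1541            # Canadian noctuoid species with barcodes
SHARING_SPECIES = 158           # species sharing a barcode with another species
TOTAL_GENERA = 387              # genera of Canadian noctuoids
HOTSPOT_GENERA = 7              # genera holding most barcode-sharing cases
HOTSPOT_SHARE_OF_CASES = 0.54   # fraction of sharing cases in those genera
HOTSPOT_SHARE_OF_SPECIES = 0.217  # fraction of all species in those genera


def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


sharing_pct = percent(SHARING_SPECIES, TOTAL_SPECIES)
hotspot_genera_pct = percent(HOTSPOT_GENERA, TOTAL_GENERA)
# Illustrative derived ratio: how over-represented the seven genera are
# among barcode-sharing cases relative to their share of species.
over_representation = HOTSPOT_SHARE_OF_CASES / HOTSPOT_SHARE_OF_SPECIES

if __name__ == "__main__":
    print(f"Species sharing barcodes: {sharing_pct:.1f}%")
    print(f"Genera holding 54% of cases: {hotspot_genera_pct:.1f}% of genera")
    print(f"Over-representation of those genera: {over_representation:.2f}x")
```

The first two values reproduce the roughly 10% and 1.8% figures cited in the paper.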

Other cases of barcode sharing may arise as a consequence of
limited or biased sampling. Where only one or a few
specimens were barcoded, some cases of barcode
sharing are likely to be associated with this artifact. A single individual does
not reflect the intraspecific diversity (either morphological or
genetic) of a species as a whole. Collection sites (e.g.,
hybrid zones) and extreme specimens with intermediate characteristics
(e.g., hybrids) can dramatically impact results. In addition,
a single species can be assigned to a unique BIN over part of its
geographic range, but share a BIN with a second species in
another region [33]. Further studies (e.g., increasing taxon
sampling and the number of genetic markers) are needed to identify the
causes of barcode sharing. As expected, NN distances
were significantly (p<0.0001) higher for introduced than native
species, undoubtedly reflecting the fact that many of them have left
their sister taxon behind in Eurasia. By contrast, introduced
species had lower intraspecific divergence (x̄ = 0.11%) than native
species (x̄ = 0.39%), reflecting the expected loss of diversity as a
consequence of founder effects. However, nonparametric tests
indicated that this difference was not significant (χ² = 4.30;
median = 0.15; p = 0.065).

Nearest-Neighbour distances were found to be significantly
lower among generalist feeders than among species with specialized
feeding habits. This result is counterintuitive as host-plant
specialization should foster diversification, creating assemblages of
closely related species. Comparison of intraspecific divergences
revealed that species feeding on grasses/herbaceous plants possess
significantly lower intraspecific barcode divergence than species with
other feeding behavior. This result conflicts with the usual
expectation that species with wide niches (e.g., generalists) should
be more variable than species with narrow niches (e.g., host-plant
specialists) [55-56]. It should be taken into account, however, that mtDNA markers
such as the barcode region are poor candidates for assessing this
association because selective sweeps on mitochondria
regularly delete variation [57]. An equally likely
scenario is that polyphagy (generalist feeding) is actually more
difficult from an evolutionary perspective: those species that are
able to switch to a broad diet could equally undergo a species
radiation (polyphagy is basically a specialized feeding strategy).
Other results suggest the need for a deeper investigation into the
linkage between host plant use and barcode divergences.
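
The test statistics behind this comparison can be approximately recomputed from the published summaries. The sketch below is not the authors' analysis script: it rebuilds the one-way ANOVA F statistic from the counts, sums and variances in Table 4 (data set 1) and the Mood's median chi-square from the counts in Table 6. The function and variable names are illustrative, and the second column of Table 6 is assumed to hold counts at or below the median.

```python
# (count, sum of NN distances, variance) per group: Table 4, data set 1.
TABLE4_DATASET1 = {
    "Grass/herbaceous": (456, 1608.29, 4.75),
    "Tree/shrubs": (456, 1518.13, 5.73),
    "Detritivore/fungivore/lichenivore": (73, 291.51, 5.29),
    "Generalist": (211, 440.53, 2.22),
}

# (species above the median, species at or below it) per group: Table 6.
# The "at or below" reading of the second column is an assumption.
TABLE6_COUNTS = {
    "Grass/herbaceous": (271, 185),
    "Tree/shrubs": (230, 226),
    "Detritivore/fungivore/lichenivore": (51, 22),
    "Generalist": (46, 165),
}


def anova_from_summary(groups):
    """One-way ANOVA from per-group (n, sum, variance) summaries."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(s for _, s, _ in groups.values()) / n_total
    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in groups.values())
    # Within-group sum of squares recovered from the sample variances.
    ss_within = sum((n - 1) * v for n, _, v in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within, "F": f_stat}


def mood_chi_square(counts):
    """Pearson chi-square on the groups x (above, not above) median table."""
    rows = list(counts.values())
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    n_total = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / n_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


anova = anova_from_summary(TABLE4_DATASET1)
chi2 = mood_chi_square(TABLE6_COUNTS)

if __name__ == "__main__":
    print(f"F = {anova['F']:.2f} on {anova['df_between']}/{anova['df_within']} df")
    print(f"Mood's median chi-square = {chi2:.2f}")
```

Because the tabulated inputs are rounded, the recomputed values only approximate the published ones; they land close to the F of 25.66 in Table 5 and the chi-square of 94.89 in Table 7.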

The potential causes of barcode sharing in the genus Catocala
appear to be particularly complex, and may include larval
hostplant-mediated mechanisms, such as those documented in
sawflies [58]. Dramatic cases of barcode sharing were detected
among two groups of taxa, those feeding on hickories (Carya spp.)
and those on poplars/willows (Salicaceae). The latter group
includes parapatric species pairs that are morphologically very


Table 7. Statistical results of nonparametric test (Mood's
Median) of the relationship between NN distances at COI and
larval food plant categories for 1196 species of Canadian
noctuoids.

N            1196
Median       3.09
Chi-Square   94.89
df           3
P-value      0.00

doi:10.1371/journal.pone.0092797.t007



similar (the C. briseis / californica / grotiana complex) and might
therefore exhibit incomplete lineage sorting due to recent or
incomplete speciation. However, barcode sharing or overlap is
equally prevalent among sympatric species of a second Salicaceae-
group, and a Carya-group. Species in both groups show strident
phenotypic differences in both adults and larvae, and their status
as bona fide species has not been questioned [59,60]; for example, C.
parta / luciana / junctura / meskei and C. briseis / semirelicta have
closely similar barcodes, but shared host plants, habitats and
similar genitalic morphologies may facilitate hybridization. Further
study of the 16 North American species in the Salicaceae-
group [59,60] is needed to resolve the evolutionary history of this
complex, particularly through nuclear gene markers and biogeographical
analysis. The same is true of the 23 species of the Carya-
group, where at least Catocala insolabilis, C. dejecta, C. lacrymosa, C.
palaeogama, C. retecta, C. judith, C. robinsonii, C. obscura, C. habilis, C.
residua, C. vidua, C. flebilis, and C. robinsonii form a series with
overlapping and identical barcodes.

Species of Grammia, a grass- and herb-feeding genus, possessed a 
particularly unusual pattern of barcode variation where species not 
only share barcodes, but often very divergent ones, suggesting that 
past hybridization events, sometimes between distantly related 
species, have led to the bidirectional introgression of mitochondrial 
genomes [18]. Broad zones of sympatry, weak divergence in 
genitalia, and overlap in pheromone usage have apparently
facilitated such hybridization [18]. All of these cases of barcode 
sharing require more detailed study to evaluate causal factors 
[33,34]. 

Aside from probing the efficacy of DNA barcodes as a tool for
species identification, the present study has examined the
correspondence between sequence clusters recognized by the
BIN system and known species. The results of this analysis indicate
the strong capacity of the BIN system to estimate species diversity
(1515 BINs versus 1541 species), supporting the conclusion of an
earlier investigation [36]. These results suggest that DNA
barcoding is poised to resolve a long-standing question: how
many animal species are there on the planet [11,61]? Moreover,
the BIN system has the capacity to do more than just deliver a
species count when it is coupled with a well-parameterized
barcode reference library. In this situation, each BIN can, in most
cases, be automatically assigned to a higher-level taxon.
Automated phylum-level assignments are now secure and class
and ordinal placements are correct in more than 90% of cases for
terrestrial animals (pers. obs.). Further parameterization of the
barcode library will undoubtedly lead to robust familial assignments
[62]. Although Ekrem et al. [63] correctly pointed out that
DNA barcodes can only deliver a species-level assignment when a
fully parameterized reference library is in place, the BIN system
will provide a species count for each major compartment of
biodiversity long before all species gain description. However, this

Table 8. The correspondence between the number of BINs and current species counts for five families of Canadian noctuoids.

Superfamily  Family        CDN species richness  Species coverage  BINs  Species count on BOLD  Notes
Noctuoidea   Notodontidae  57                    57                63    62                     3 subsp. + 2 sp. under study
             Euteliidae    8                     8                 8     8
             Nolidae       17                    17                20    19                     2 new sp.
             Noctuidae     1145                  1133              1090  1159                   species complex + new sp. + subsp.
             Erebidae      328                   326               337   357                    species complex + 16 new sp. + subsp.
Total                      1555                  1541              1518  1605

doi:10.1371/journal.pone.0092797.t008

capacity will require more large-scale reference libraries such as 
the one assembled in this study. 
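
As a cross-check on the BIN-versus-species comparison, the column totals of Table 8 and the ratio of BINs to barcoded species in each family can be recomputed directly. The sketch below simply transcribes the table; the dictionary layout and variable names are illustrative choices, not part of the study.

```python
# Per family: (CDN species richness, species coverage, BINs, species count on BOLD),
# transcribed from Table 8.
TABLE8 = {
    "Notodontidae": (57, 57, 63, 62),
    "Euteliidae": (8, 8, 8, 8),
    "Nolidae": (17, 17, 20, 19),
    "Noctuidae": (1145, 1133, 1090, 1159),
    "Erebidae": (328, 326, 337, 357),
}

# Column-wise totals across the five families.
totals = tuple(sum(column) for column in zip(*TABLE8.values()))

# BINs per barcoded species: values above 1 mean more BINs than named
# species, values below 1 mean fewer.
bin_ratio = {family: bins / coverage
             for family, (_, coverage, bins, _) in TABLE8.items()}
overall_ratio = totals[2] / totals[1]

if __name__ == "__main__":
    print("Totals (richness, coverage, BINs, BOLD):", totals)
    for family, ratio in bin_ratio.items():
        print(f"{family}: {ratio:.3f} BINs per barcoded species")
    print(f"All families: {overall_ratio:.3f}")
```

Read this way, the Noctuidae fall below one BIN per species while the other families with a mismatch sit above it, which is why the overall counts agree more closely than the family-level counts do.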

Supporting Information 

Table S1 List of 158 species that cannot be discriminated
from one or more of their congeners with DNA
barcodes when considered on a Canada-wide basis.

Because the species assemblage varies regionally, the incidence of 
barcode sharing decreases when considered regionally. This table 
presents data for three regions: New Brunswick/Nova Scotia, 
Ontario, and British Columbia. 
(XLS) 

Table S2 Fifty-seven assemblages of Canadian noctuoids
where two or more species share their barcode
sequence(s).

(XLS) 

Table S3 Seventy-six assemblages of Canadian noctuoids
where two or more species possess low sequence
divergence (<2%), but with no evidence of sequence
sharing. Asterisks indicate cases where a species shows slight 
sequence divergence from two or more species which share 
barcodes. 
(XLS) 

Table S4 Canadian noctuoids with a maximum intraspecific
barcode divergence >2% (121 species) or that
were partitioned into two or more BINs (140 species).

Asterisks indicate species with less than 2% maximum divergence. 
(XLS) 

Tree S1 NJ tree for Canadian species in the family
Notodontidae. 

(PDF) 

Tree S2 NJ tree for Canadian species in the family 
Euteliidae. 

(PDF) 

Tree S3 NJ tree for Canadian species in the family 
Nolidae. 

(PDF) 



Tree S4 NJ tree for Canadian species in the family 
Erebidae. 

(PDF) 

Tree S5 NJ tree for Canadian species in the family 
Noctuidae. 

(PDF) 

Data set S1 Data set for Canadian Noctuoidea: families
Notodontidae, Euteliidae, Nolidae, Erebidae and
Noctuidae.

(XLS) 

Data set S2 List of Canadian species of Noctuoidea and
the number of barcode records for each taxon. Species
with specimens barcoded from outside Canada are marked in green.
Missing species (14 taxa) are in red.
(XLS) 

Data set S3 GenBank accession numbers.

(XLS)